Locally discriminative topic modeling
Abstract
Topic modeling is a powerful tool for discovering the underlying or hidden structure in text corpora. Typical algorithms for topic modeling include probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA). Despite their different inspirations, both approaches are instances of generative models, and both ignore the discriminative structure of the documents. In this paper, we propose the locally discriminative topic model (LDTM), a novel topic modeling approach that considers both the generative and the discriminative structure of the data space. Unlike PLSA and LDA, in which the topic distribution of a document depends on all the other documents, LDTM takes a local perspective: the topic distribution of each document depends strongly on its neighbors. By modeling the local relationships of documents within each neighborhood via a local linear model, we learn topic distributions that vary smoothly along the geodesics of the data manifold and can better capture the discriminative structure in the data. Experimental results on text clustering and web page categorization demonstrate the effectiveness of the proposed approach. © 2011 Elsevier Ltd. All rights reserved.
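The local perspective described in the abstract can be illustrated with a small sketch. This is not the LDTM algorithm: the abstract gives no formulas, so the code below swaps in a simpler stand-in, a nearest-neighbor smoothness penalty, in place of the paper's local linear model. The function names, the cosine-similarity neighborhood, and the toy data are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def k_nearest_neighbors(docs, k):
    """For each document, return the indices of its k most similar documents."""
    neighbors = []
    for i, doc in enumerate(docs):
        scored = [(cosine(doc, other), j) for j, other in enumerate(docs) if j != i]
        # Highest similarity first; ties broken by lower index.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        neighbors.append([j for _, j in scored[:k]])
    return neighbors


def local_smoothness_penalty(topics, neighbors):
    """Sum of squared differences between each document's topic
    distribution and the distributions of its neighbors.

    Assumption: a stand-in for the paper's local linear model, which
    the abstract names but does not specify.
    """
    total = 0.0
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            total += sum((a - b) ** 2 for a, b in zip(topics[i], topics[j]))
    return total


# Hypothetical toy corpus: rows are documents, columns are term counts.
docs = [
    [3, 1, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
]
neighbors = k_nearest_neighbors(docs, k=1)

# Two candidate topic assignments (each row is a distribution over 2 topics).
smooth_topics = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
rough_topics = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1], [0.1, 0.9]]

smooth_penalty = local_smoothness_penalty(smooth_topics, neighbors)
rough_penalty = local_smoothness_penalty(rough_topics, neighbors)
```

Under these assumptions, an assignment that gives neighboring documents similar topic distributions receives a lower penalty than one that does not, which is the intuition behind topic distributions that "vary smoothly" over the data.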
Similar resources
Hallucinating system outputs for discriminative language modeling
Project overview • NSF funded project and recent JHU summer workshop team • General topic: discriminative language modeling for ASR and MT – Learning language models with discriminative objectives • Specific topic: learning models from text only – Enabling use of much more training data; adaptation scenarios • Have made some progress with ASR models (topic today) – Less progress on improving MT...
Open Domain Short Text Conceptualization: A Generative + Descriptive Modeling Approach
Concepts embody the knowledge to facilitate our cognitive processes of learning. Mapping short texts to a large set of open domain concepts has gained many successful applications. In this paper, we unify the existing conceptualization methods from a Bayesian perspective, and discuss the three modeling approaches: descriptive, generative, and discriminative models. Motivated by the discussion o...
Markovian Discriminative Modeling for Dialog State Tracking
Discriminative dialog state tracking has recently become a hot topic in the dialog research community. Compared to the generative approach, it has the advantage of being able to handle arbitrary dependent features, which is very appealing. In this paper, we present our approach to the DSTC2 challenge. We propose to use discriminative Markovian models as a natural enhancement to the stationary discrimin...
Discriminative Bi-Term Topic Model for Headline-Based Social News Clustering
Social news are becoming increasingly popular. News organizations and popular journalists are starting to use social media more and more heavily for broadcasting news. The major challenge in social news clustering lies in the fact that textual content is only a headline, which is much shorter than the fulltext. Previous works showed that the bi-term topic model (BTM) is effective in modeling sh...
Discriminative Random Field Modeling of Lung Tumors in CT Scans
The ability to conduct high-quality automatic 3D segmentation of tumors in CT scans is of high value to busy radiologists. Discriminative random fields (DRFs) were used to segment 3D volumes of lung tumors in CT scan data. Optimal parameters for the DRF inference were first calculated using gradient ascent. These parameters were then used to solve the inference problem using the graph cuts algo...
Journal title:
- Pattern Recognition
Volume 45, issue
Pages -
Publication date: 2012